Automatic Speech Recognition-based Information Center for Indonesian Language

Author

  • Sri Hartati
Abstract

This paper discusses an implementation of CMU Sphinx-4 in an automatic speech recognition-based information center (ASRIC) for the Indonesian language. The ASRIC uses a vector space model (VSM) to improve the performance of the statistical language model and to recognize user utterances more flexibly. Testing with an Indonesian speaker shows that the VSM is capable of reducing the query error rate within an acceptable response time. The VSM also makes the ASRIC more flexible in recognizing spoken sentences that miss some keywords, contain words in random order, or include unimportant words.
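As a rough illustration of why a vector space model tolerates missing keywords, reordered words, and filler words, the sketch below matches a recognized utterance against a small set of stored queries using term-frequency vectors and cosine similarity. It is not the paper's implementation: the query set, the stopword list, the threshold, and all function names are assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical filler words to ignore; not taken from the paper.
STOPWORDS = {"saya", "ingin", "tolong", "mohon"}

# Hypothetical information-center queries, keyed by an illustrative id.
QUERIES = {
    "train_schedule": "jadwal kereta jakarta bandung",
    "ticket_price": "harga tiket kereta jakarta bandung",
}


def vectorize(text):
    """Build a term-frequency vector; word order is discarded."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_query(utterance, threshold=0.5):
    """Return (query_id, score); query_id is None below the assumed threshold."""
    u = vectorize(utterance)
    scores = {qid: cosine(u, vectorize(q)) for qid, q in QUERIES.items()}
    qid = max(scores, key=scores.get)
    if scores[qid] < threshold:
        return None, scores[qid]
    return qid, scores[qid]


# Filler words, a missing keyword ("kereta"), and swapped city order
# still map to the ticket-price query.
print(best_query("saya ingin harga tiket bandung jakarta"))
```

Because the vectors ignore word order and the stopword filter drops filler words, the match depends only on which content words overlap, which is the flexibility the abstract attributes to the VSM.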

Similar resources

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among these, lip-reading techniques have encountered many challenges for speech recognition, that one of the challenges b...

Development of Indonesian Large Vocabulary Continuous Speech Recognition System within A-STAR Project

The paper outlines the development of a large vocabulary continuous speech recognition (LVCSR) system for the Indonesian language within the Asian speech translation (A-STAR) project. An overview of the A-STAR project and Indonesian language characteristics will be briefly described. We then focus on a discussion of the development of Indonesian LVCSR, including data resources issues, acoustic ...

Target Structured Cross Language Model Refinement

The task of porting Automatic Speech Recognition (ASR) technology to many languages is hindered by a lack of transcribed acoustic data, which in turn prevents the development of accurate acoustic models necessary for the recognition task. To overcome this problem, recent research has sought to exploit the similarity of sounds across languages, and use this similarity to adapt models from one or...

Cross Lingual Modelling Experiments for Indonesian

The extension of Large Vocabulary Continuous Speech Recognition (LVCSR) to resource poor languages such as Indonesian is hindered by the lack of transcribed acoustic data and appropriate pronunciation lexicons. Research has generally been directed toward establishing robust cross-lingual acoustic models, with the assumption that phonetic lexicons are readily available. This is not the case for ...

Integrated Natural Language Call Routing

We present a new integrated approach for natural language call routing based on stochastic language models. The system learns automatically from examples to direct a call to the appropriate destination within a call center. It employs stochastic language models for each call destination. The language models are generated by a language model adaptation algorithm based on the minimum discriminati...

Publication date: 2013